16 research outputs found

    Efficient and Secure Implementations of Lightweight Symmetric Cryptographic Primitives

    Get PDF
    This thesis is devoted to efficient and secure implementations of lightweight symmetric cryptographic primitives for resource-constrained devices such as wireless sensors and actuators that are typically deployed in remote locations. In this setting, cryptographic algorithms must consume few computational resources and withstand a large variety of attacks, including side-channel attacks. The first part of this thesis is concerned with efficient software implementations of lightweight symmetric algorithms on 8, 16, and 32-bit microcontrollers. A first contribution of this part is the development of FELICS, an open-source benchmarking framework that facilitates the extraction of comparative performance figures from implementations of lightweight ciphers. Using FELICS, we conducted a fair evaluation of the implementation properties of 19 lightweight block ciphers in the context of two different usage scenarios, which are representatives for common security services in the Internet of Things (IoT). This study gives new insights into the link between the structure of a cryptographic algorithm and the performance it can achieve on embedded microcontrollers. Then, we present the SPARX family of lightweight ciphers and describe the impact of software efficiency in the process of shaping three instances of the family. Finally, we evaluate the cost of the main building blocks of symmetric algorithms to determine which are the most efficient ones. The contributions of this part are particularly valuable for designers of lightweight ciphers, software and security engineers, as well as standardization organizations. In the second part of this work, we focus on side-channel attacks that exploit the power consumption or the electromagnetic emanations of embedded devices executing unprotected implementations of lightweight algorithms. First, we evaluate different selection functions in the context of Correlation Power Analysis (CPA) to infer which operations are easy to attack. Second, we show that most implementations of the AES present in popular open-source cryptographic libraries are vulnerable to side-channel attacks such as CPA, even in a network protocol scenario where the attacker has limited control of the input. Moreover, we describe an optimal algorithm for recovery of the master key using CPA attacks. Third, we perform the first electromagnetic vulnerability analysis of Thread, a networking stack designed to facilitate secure communication between IoT devices. The third part of this thesis lies in the area of side-channel countermeasures against power and electromagnetic analysis attacks. We study efficient and secure expressions that compute simple bitwise functions on Boolean shares. To this end, we describe an algorithm for efficient search of expressions that have an optimal cost in number of elementary operations. Then, we introduce optimal expressions for first-order Boolean masking of bitwise AND and OR operations. Finally, we analyze the performance of three lightweight block ciphers protected using the optimal expressions

    A Lightweight Implementation of NTRU Prime for the Post-Quantum Internet of Things

    Get PDF
    The dawning era of quantum computing has initiated various initiatives for the standardization of post-quantum cryptosystems with the goal of (eventually) replacing RSA and ECC. NTRU Prime is a variant of the classical NTRU cryptosystem that comes with a couple of tweaks to minimize the attack surface; most notably, it avoids rings with "worrisome" structure. This paper presents, to our knowledge, the first assembler-optimized implementation of Streamlined NTRU Prime for an 8-bit AVR microcontroller and shows that high-security lattice-based cryptography is feasible for small IoT devices. An encapsulation operation using parameters for 128-bit post-quantum security requires 8.2 million clock cycles when executed on an 8-bit ATmega1284 microcontroller. The decapsulation is approximately twice as costly and has an execution time of 15.6 million cycles. We achieved this performance through (i) new low-level software optimization techniques to accelerate Karatsuba-based polynomial multiplication on the 8-bit AVR platform and (ii) an efficient implementation of the coefficient modular reduction written in assembly language. The execution time of encapsulation and decapsulation is independent of secret data, which makes our software resistant against timing attacks. Finally, we assess the performance one could theoretically gain by using a so-called product-form polynomial as part of the secret key and discuss potential security implications

    Efficient Implementation of the SHA-512 Hash Function for 8-bit AVR Microcontrollers

    Get PDF
    SHA-512 is a member of the SHA-2 family of cryptographic hash algorithms that is based on a Davies-Mayer compression function operating on eight 64-bit words to produce a 512-bit digest. It provides strong resistance to collision and preimage attacks, and is assumed to remain secure in the dawning era of quantum computers. However, the compression function of SHA-512 is challenging to implement on small 8 and 16-bit microcontrollers because of their limited register space and the fact that 64-bit rotations are generally slow on such devices. In this paper, we present the first highly-optimized Assembler implementation of SHA-512 for the ATmega family of 8-bit AVR microcontrollers. We introduce a special optimization technique for the compression function based on a duplication of the eight working variables so that they can be more efficiently loaded from RAM via the indirect addressing mode with displacement (using the ldd and std instruction). In this way, we were able to achieve high performance without unrolling the main loop of the compression function, thereby keeping the code size small. When executed on an 8-bit AVR ATmega128 microcontroller, the compression function takes slightly less than 60k clock cycles, which corresponds to a compression rate of roughly 467 cycles per byte. The binary code size of the full SHA-512 implementation providing a standard Init-Update-Final (IUF) interface amounts to approximately 3.5 kB

    Argon2: New Generation of Memory-Hard Functions for Password Hashing and Other Applications

    Get PDF
    We present a new hash function Argon2, which is oriented at protection of low-entropy secrets without secret keys. It requires a certain (but tunable) amount of memory, imposes prohibitive time-memory and computation-memory tradeoffs on memory-saving users, and is exceptionally fast on regular PC. Overall, it can provide ASIC-and botnet-resistance by filling the memory in 0.6 cycles per byte in the non-compressible way

    Side-Channel Attacks meet Secure Network Protocols

    Get PDF
    Side-channel attacks are powerful tools for breaking systems that implement cryptographic algorithms. The Advanced Encryption Standard (AES) is widely used to secure data, including the communication within various network protocols. Major cryptographic libraries such as OpenSSL or ARM mbed TLS include at least one implementation of the AES. In this paper, we show that most implementations of the AES present in popular open-source cryptographic libraries are vulnerable to side-channel attacks, even in a network protocol scenario when the attacker has limited control of the input. We present an algorithm for symbolic processing of the AES state for any input configuration where several input bytes are variable and known, while the rest are fixed and unknown as is the case in most secure network protocols. Then, we classify all possible inputs into 25 independent evaluation cases depending on the number of bytes controlled by attacker and the number of rounds that must be attacked to recover the master key. Finally, we describe an optimal algorithm that can be used to recover the master key using Correlation Power Analysis (CPA) attacks. Our experimental results raise awareness of the insecurity of unprotected implementations of the AES used in network protocol stacks

    Efficient Masking of ARX-Based Block Ciphers Using Carry-Save Addition on Boolean Shares

    Get PDF
    Masking is a widely-used technique to protect block ciphers and other symmetric cryptosystems against Differential Power Analysis (DPA) attacks. Applying masking to a cipher that involves both arithmetic and Boolean operations requires a conversion between arithmetic and Boolean masks. An alternative approach is to perform the required arithmetic operations (e.g. modular addition or subtraction) directly on Boolean shares. At FSE 2015, Coron et al. proposed a logarithmic-time algorithm for modular addition on Boolean shares based on the Kogge-Stone carry-lookahead adder. We revisit their addition algorithm in this paper and present a fast implementation for ARM processors. Then, we introduce a new technique for direct modular addition/subtraction on Boolean shares using a simple Carry-Save Adder (CSA) in an iterative fashion. We show that the average complexity of CSA-based addition on Boolean shares grows logarithmically with the operand size, similar to the Kogge-Stone carry-lookahead addition, but consists of only a single AND, an XOR, and a left-shift per iteration. A 32-bit CSA addition~on Boolean shares has an average execution time of 162 clock cycles on an ARM Cortex-M3 processor, which is approximately 43% faster than the Kogge-Stone adder. The performance gain increases to over 55% when comparing the average subtraction times. We integrated both addition techniques into a masked implementation of the block cipher Speck and found that the CSA-based variant clearly outperforms its Kogge-Stone counterpart by a factor of 1.70 for encryption and 2.30 for decryption

    Optimal First-Order Boolean Masking for Embedded IoT Devices

    Get PDF
    Boolean masking is an effective side-channel countermeasure that consists in splitting each sensitive variable into two or more shares which are carefully manipulated to avoid leakage of the sensitive variable. The best known expressions for Boolean masking of bitwise operations are relatively compact, but even a small improvement of these expressions can significantly reduce the performance penalty of more complex masked operations such as modular addition on Boolean shares or of masked ciphers. In this paper, we present and evaluate new secure expressions for performing bitwise operations on Boolean shares. To this end, we describe an algorithm for efficient search of expressions that have an optimal cost in number of elementary operations. We show that bitwise AND and OR on Boolean shares can be performed using less instructions than the best known expressions. More importantly, our expressions do no require additional random values as the best known expressions do. We apply our new expressions to the masked addition/subtraction on Boolean shares based on the Kogge-Stone adder and we report an improvement of the execution time between 14% and 19%. Then, we compare the efficiency of first-order masked implementations of three lightweight block ciphers on an ARM Cortex-M3 to determine which design strategies are most suitable for efficient masking. All our masked implementations passed the t-test evaluation and thus are deemed secure against first-order side-channel attacks

    Triathlon of Lightweight Block Ciphers for the Internet of Things

    Get PDF
    In this paper, we introduce a framework for the benchmarking of lightweight block ciphers on a multitude of embedded platforms. Our framework is able to evaluate the execution time, RAM footprint, as well as binary code size, and allows one to define a custom "figure of merit" according to which all evaluated candidates can be ranked. We used the framework to benchmark implementations of 19 lightweight ciphers, namely AES, Chaskey, Fantomas, HIGHT, LBlock, LEA, LED, Piccolo, PRESENT, PRIDE, PRINCE, RC5, RECTANGLE, RoadRunneR, Robin, Simon, SPARX, Speck, and TWINE, on three microcontroller platforms: 8-bit AVR, 16-bit MSP430, and 32-bit ARM. Our results bring some new insights into the question of how well these lightweight ciphers are suited to secure the Internet of things. The benchmarking framework provides cipher designers with an easy-to-use tool to compare new algorithms with the state of the art and allows standardization organizations to conduct a fair and consistent evaluation of a large number of candidates

    Artificial Intelligence as a Substitute for Human Creativity

    Get PDF
    Creativity has always been perceived as a human trait, even though the exact neural mechanisms remain unknown, it has been the subject of research and debate for a long time. The recent development of AI technologies and increased interest in AI has led to many projects capable of performing tasks that have been previously regarded as impossible without human creativity. Music composition, visual arts, literature, and science represent areas in which these technologies have started to both help and replace the creative human, with the question of whether AI can be creative and capable of creation more realistic than ever. This review aims to provide an extensive perspective over several state-of-the art technologies and applications based on AI which are currently being implemented into areas of interest closely correlated to human creativity, as well as the economic impact the development of such technologies might have on those domains

    Argon and Argon2

    Get PDF
    This is a design specification for the functions Argon and Argon2 for the international password hashing competition (PHC), 2013-2015. Argon is our original submission to PHC. It is a multipurpose hash function, that is optimized for highest resilience against tradeoff attacks, so that any, even small memory reduction would lead to significant time and computational penalties. Argon can be used for password hashing, key derivation, or any other memory-hard computation (e.g., for cryptocurrencies). Argon2 summarizes the state of the art in the design of memory-hard functions. It is a streamlined and simple design. It aims at the highest memoryfilling rate and effective use of multiple computing units, while still providing defense against tradeoff attacks. Argon2 is optimized for the x86 architecture and exploits the cache and memory organization of the recent Intel and AMD processors. Argon2 has two variants: Argon2d and Argon2i. Argon2d is faster and uses data-depending memory access, which makes it suitable for cryptocurrencies and applications with no threats from side-channel timing attacks. Argon2i uses data-independent memory access, which is preferred for password hashing and password based key derivation. Argon2i is slower as it makes more passes over the memory to protect from tradeoff attacks. We recommend Argon for the applications that aim for the highest tradeoff resilience and want to guarantee prohibitive time and computational penalties on any memory-reducing implementation. According to our cryptanalytic algorithms, an attempt to use half of the requested memory (for instance, 64 MB instead of 128 MB) results in the speed penalty factor of 140 and in the penalty 218. The penalty grows exponentially as the available memory decreases, which effectively prohibits the adversary to use any smaller amount of memory. Such high computational penalties are a unique feature of Argon. We recommend Argon2 for the applications that aim for high performance. Both versions of Argon2 allow to fill 1 GB of RAM in a fraction of second, and smaller amounts even faster. It scales easily to the arbitrary number of parallel computing units. Its design is also optimized for clarity to ease analysis and implementation
    corecore